A New Clustering Algorithm Based on Near Neighbor Influence
Abstract
Clustering is used in many areas. It is an unsupervised learning method that tries to find distributions and patterns in unlabeled data sets. Although clustering algorithms have been studied for decades, none of them is all-purpose. This paper presents a new clustering algorithm, Clustering based on Near Neighbor Influence (CNNI); ICNNI, a version of CNNI with improved time cost; and VCNNI, a variation of CNNI. All three are inspired by the idea of near neighbors and the superposition principle of influence. To describe the three algorithms clearly, the paper lists three basic concepts (near neighbor point set, grid cell, and near neighbor grid cell set) and introduces two important concepts (near neighbor influence and a kind of similarity measure). In the simulations, four well-known clustering algorithms (K-Means, FCM, AP, and DBSCAN) serve as comparison algorithms. In simulated experiments on several artificial and real data sets, we observed that CNNI, ICNNI, and VCNNI can find the obvious clusters and, for some data sets, obtain clustering results better than or similar to those of K-Means, FCM, and AP. We also observed that ICNNI is faster than CNNI with the same clustering results, that CNNI and ICNNI are faster than AP with better or similar clustering quality, that CNNI needs less space than VCNNI and DBSCAN, and that VCNNI obtains clustering results similar to those of DBSCAN. In particular, CNNI, ICNNI, and VCNNI can easily find noise or isolated points. Finally, the paper gives several solid and insightful suggestions for future research. © 2015 Elsevier Ltd. All rights reserved.
Similar resources
Extracting Prior Knowledge from Data Distribution to Migrate from Blind to Semi-Supervised Clustering
Although many studies have been conducted to improve clustering efficiency, most state-of-the-art schemes suffer from a lack of robustness and stability. This paper aims to propose an efficient approach to elicit prior knowledge in terms of must-link and cannot-link from the estimated distribution of raw data in order to convert a blind clustering problem into a semi-supervised o...
Data Clustring Using A New CGA(Chaotic-Generic Algorithm) Approach
Clustering is the process of dividing a set of input data into a number of subgroups. The members of each subgroup are similar to each other but different from members of other subgroups. The genetic algorithm has enjoyed many applications in clustering data. One of these applications is the clustering of images. The problem with the earlier methods used in clustering images was in selecting in...
A Fuzzy C-means Algorithm for Clustering Fuzzy Data and Its Application in Clustering Incomplete Data
The fuzzy c-means clustering algorithm is a useful tool for clustering, but it is convenient only for crisp, complete data. In this article, an enhancement of the algorithm is proposed which is suitable for clustering trapezoidal fuzzy data. A linear ranking function is used to define a distance for trapezoidal fuzzy data. Then, as an application, a method based on the proposed algorithm is pres...
An Improved SSPCO Optimization Algorithm for Solve of the Clustering Problem
Swarm Intelligence (SI) is an innovative artificial intelligence technique for solving complex optimization problems. Data clustering is the process of grouping data into a number of clusters. The goal of data clustering is to make the data in the same cluster share a high degree of similarity while being very dissimilar to data from other clusters. Clustering algorithms have been applied to a ...
Journal title:
- Expert Syst. Appl.
Volume 42, issue
Pages -
Publication date 2015